Topic Identification Using Wikipedia Graph Centrality
Authors
Abstract
This paper presents a method for automatic topic identification using a graph-centrality algorithm applied to an encyclopedic graph derived from Wikipedia. When tested on a data set with manually assigned topics, the system is found to significantly improve over a simpler baseline that does not make use of the external encyclopedic knowledge.
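The idea of ranking topics by graph centrality can be illustrated with a minimal sketch. The abstract does not say which centrality measure or graph construction the system uses, so everything below is an assumption made for illustration: the centrality is a PageRank biased toward the articles mentioned in a document, links are treated as undirected, and the tiny `EDGES` graph, the `Article:`/`Category:` node names, and the helper `identify_topics` are all hypothetical.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport step returns to the seed nodes."""
    # Assumption: links are undirected, so score can flow between
    # articles and the categories they belong to in both directions.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(neighbors)

    seeds_in_graph = [s for s in seeds if s in neighbors]
    if not seeds_in_graph:
        raise ValueError("no seed node appears in the graph")

    # Teleport distribution: uniform over the seed nodes, zero elsewhere.
    teleport = {n: 0.0 for n in nodes}
    for s in seeds_in_graph:
        teleport[s] += 1.0 / len(seeds_in_graph)

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            share = damping * rank[n] / len(neighbors[n])
            for m in neighbors[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical toy fragment of an article/category graph (not real Wikipedia data).
EDGES = [
    ("Article:Guitar", "Category:String instruments"),
    ("Article:Violin", "Category:String instruments"),
    ("Article:Piano", "Category:Keyboard instruments"),
    ("Category:String instruments", "Category:Musical instruments"),
    ("Category:Keyboard instruments", "Category:Musical instruments"),
    ("Article:Football", "Category:Ball games"),
    ("Category:Ball games", "Category:Sports"),
]


def identify_topics(edges, mentioned_articles, top_k=3):
    """Return the top_k category nodes ranked by biased PageRank score."""
    rank = personalized_pagerank(edges, mentioned_articles)
    categories = [(n, s) for n, s in rank.items() if n.startswith("Category:")]
    # Sort by descending score, breaking ties by name for determinism.
    categories.sort(key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in categories[:top_k]]


topics = identify_topics(EDGES, ["Article:Guitar", "Article:Violin"])
print(topics)
```

In this toy graph, a document mentioning the guitar and violin articles puts the string-instruments category first, and the disconnected sports categories receive no score. A baseline that ignores the external graph would have no way to reach a category label that never appears in the text itself.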
Similar resources
Galaxysearch - Discovering the Knowledge of Many by Using Wikipedia as a Meta-Searchindex
We propose a dynamic map of knowledge generated from Wikipedia pages and the Web URLs contained therein. GalaxySearch provides answers to the questions we don’t know how to ask, by constructing a semantic network of the most relevant pages in Wikipedia related to a search term. This search graph is constructed based on the Wikipedia bidirectional link structure, the most recent edits on the pag...
Using Encyclopedic Knowledge for Automatic Topic Identification
This paper presents a method for automatic topic identification using an encyclopedic graph derived from Wikipedia. The system is found to exceed the performance of previously proposed machine learning algorithms for topic identification, with an annotation consistency comparable to human annotations.
Efficient Computation of Relationship-Centrality in Large Entity-Relationship Graphs
Given two sets of entities – potentially the results of two queries on a knowledge-graph like YAGO or DBpedia– characterizing the relationship between these sets in the form of important people, events and organizations is an analytics task useful in many domains. In this paper, we present an intuitive and efficiently computable vertex centrality measure that captures the importance of a node w...
A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogsphere
There are two key issues for information diffusion in blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, (2) information diffusion through blogosphere is primarily driven by the “word-of-mouth” effect, thus making topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph, in which...
Predicting Central Topics in a Blog Corpus from a Networks Perspective
In today’s content-centric Internet, blogs are becoming increasingly popular and important from a data analysis perspective. According to Wikipedia, there were over 156 million public blogs on the Internet as of February 2011. Blogs are a reflection of our contemporary society. The contents of different blog posts are important from social, psychological, economical and political perspectives. ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2009